153 research outputs found
Insignificant shadow detection for video segmentation
To prevent moving cast shadows from being misclassified as part of moving objects in change-detection-based video segmentation, this paper proposes a novel approach to cast shadow detection based on edge and region information in multiple frames. First, an initial change detection mask containing moving objects and cast shadows is obtained. Then a Canny edge map is generated. After that, the shadow region is detected and removed through multiframe integration, edge matching, and region growing. Finally, a post-processing procedure is used to eliminate noise and refine the object boundaries. Our approach can be used for video segmentation in indoor environments. The experimental results demonstrate its good performance.
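As a rough illustration of the edge-based filtering step described above (not the authors' implementation), the following Python sketch drops change-mask regions that have little Canny edge support. It assumes OpenCV and NumPy, uses hypothetical inputs frame_gray and change_mask, and omits the multiframe integration and region-growing stages.

# Illustrative sketch only, not the authors' method; assumes OpenCV/NumPy and
# hypothetical inputs: frame_gray (grayscale frame), change_mask (binary mask).
import cv2
import numpy as np

def remove_cast_shadows(frame_gray, change_mask, edge_support_thresh=0.3):
    # Edge map of the current frame: object boundaries tend to produce strong
    # edges, while shadow interiors are comparatively smooth.
    edges = cv2.Canny(frame_gray, 50, 150)

    # Connected regions of the initial change-detection mask.
    num_labels, labels = cv2.connectedComponents(change_mask.astype(np.uint8))

    object_mask = np.zeros_like(change_mask, dtype=np.uint8)
    for label in range(1, num_labels):
        region = labels == label
        # Fraction of region pixels that coincide with Canny edges; regions
        # with little edge support are treated as cast shadow and removed.
        edge_support = edges[region].mean() / 255.0
        if edge_support >= edge_support_thresh:
            object_mask[region] = 1

    # Post-processing: suppress small noise and smooth object boundaries.
    kernel = np.ones((3, 3), np.uint8)
    return cv2.morphologyEx(object_mask, cv2.MORPH_OPEN, kernel)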
Low-Light Image Enhancement with Illumination-Aware Gamma Correction and Complete Image Modelling Network
This paper presents a novel network structure with illumination-aware gamma
correction and complete image modelling to solve the low-light image
enhancement problem. Low-light environments usually produce large, less informative dark areas, so directly learning deep representations from low-light images is ineffective for recovering normal illumination. We propose to integrate the effectiveness of gamma correction with the strong modelling capacity of deep networks, which enables the correction factor gamma to be learned in a coarse-to-elaborate manner by adaptively perceiving the deviated illumination. Because the exponential operation introduces high computational complexity, we propose to approximate gamma correction with a Taylor series, accelerating both training and inference. Since dark areas usually occupy large portions of low-light images, common local modelling structures, e.g., CNNs and SwinIR, are insufficient to recover accurate illumination across whole low-light images. We propose a novel Transformer block that fully models the dependencies among all pixels across the image via a local-to-global hierarchical attention mechanism, so that dark areas can be inferred by borrowing information from distant, informative regions in a highly effective manner.
Extensive experiments on several benchmark datasets demonstrate that our approach outperforms state-of-the-art methods.
Comment: Accepted by ICCV 2023
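To make the Taylor-series idea concrete, here is a minimal NumPy sketch that approximates per-pixel gamma correction x**gamma = exp(gamma * ln x) with a truncated expansion of the exponential. The expansion order and the clipping threshold are illustrative assumptions, not details from the paper.

# Minimal sketch: approximating x**gamma = exp(gamma * ln x) with a truncated
# Taylor series of exp(.), avoiding the exact exponential. The order K is an
# assumption for illustration.
import numpy as np

def gamma_correct_taylor(x, gamma, order=4):
    """x: image with values in (0, 1]; gamma: correction factor (scalar or
    per-pixel array, e.g., predicted by a network)."""
    t = gamma * np.log(np.clip(x, 1e-6, 1.0))   # t = gamma * ln(x)
    result = np.zeros_like(x)
    term = np.ones_like(x)                       # t**0 / 0!
    for k in range(order + 1):
        result = result + term
        term = term * t / (k + 1)                # next term: t**(k+1) / (k+1)!
    return result

# Exact reference for comparison: np.power(x, gamma)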
Structure-Preserving Graph Representation Learning
Though graph representation learning (GRL) has made significant progress, it
is still a challenge to extract and embed the rich topological structure and
feature information in an adequate way. Most existing methods focus on local
structure and fail to fully incorporate the global topological structure. To
this end, we propose a novel Structure-Preserving Graph Representation Learning
(SPGRL) method, to fully capture the structure information of graphs.
Specifically, to reduce the uncertainty and misinformation of the original graph, we construct a feature graph as a complementary view via the k-Nearest Neighbor method. The feature graph can be used for node-level contrast to capture local relations. Besides, we retain the global topological structure information by maximizing the mutual information (MI) between the whole graph and the feature embeddings, which theoretically reduces to exchanging the embeddings of the feature graph and the original graph so that each reconstructs itself. Extensive experiments show that our method achieves superior performance on the semi-supervised node classification task and excellent robustness under noise perturbations of the graph structure or node features.
Comment: Accepted by the IEEE International Conference on Data Mining (ICDM) 2022. arXiv admin note: text overlap with arXiv:2108.0482
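Below is a minimal sketch of the kind of k-nearest-neighbor feature graph used as a complementary view; the choice of k, the cosine metric, and the use of scikit-learn are assumptions for illustration, not the authors' settings.

# Illustrative construction of a kNN feature graph over node features; the
# parameters are assumptions, not the paper's configuration.
import numpy as np
from sklearn.neighbors import kneighbors_graph

def build_feature_graph(node_features, k=10):
    """Return a symmetric binary adjacency matrix where two nodes are linked
    if either is among the other's k nearest neighbors in feature space."""
    adj = kneighbors_graph(node_features, n_neighbors=k,
                           metric='cosine', mode='connectivity')
    adj = adj.toarray()
    adj = np.maximum(adj, adj.T)   # symmetrize the neighborhood relation
    return adj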
Self-Supervision Can Be a Good Few-Shot Learner
Existing few-shot learning (FSL) methods rely on training with a large
labeled dataset, which prevents them from leveraging abundant unlabeled data.
From an information-theoretic perspective, we propose an effective unsupervised
FSL method, learning representations with self-supervision. Following the
InfoMax principle, our method learns comprehensive representations by capturing
the intrinsic structure of the data. Specifically, we maximize the mutual
information (MI) of instances and their representations with a low-bias MI
estimator to perform self-supervised pre-training. Unlike supervised pre-training, which focuses on the discriminative features of the seen classes, our self-supervised model has less bias toward the seen classes, resulting in better generalization to unseen classes. We explain that supervised pre-training and self-supervised pre-training actually maximize different MI objectives. Extensive experiments are further conducted to analyze their FSL performance under various training settings. Surprisingly, the results show that self-supervised pre-training can outperform supervised pre-training under appropriate conditions. Compared with state-of-the-art FSL methods, our approach achieves comparable performance on widely used FSL benchmarks without any labels of the base classes.
Comment: ECCV 2022, code: https://github.com/bbbdylan/unisia
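The paper's low-bias MI estimator is not reproduced here; the following PyTorch snippet shows a generic InfoNCE-style contrastive loss as a common stand-in for maximizing MI between two augmented views of the same instance during self-supervised pre-training.

# Generic InfoNCE-style sketch, NOT the paper's low-bias estimator; it only
# illustrates maximizing MI between two augmented views of the same images.
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """z1, z2: (N, D) embeddings of two augmentations of the same N images."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature          # (N, N) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    # Matching pairs (the diagonal) act as positives, all others as negatives.
    return F.cross_entropy(logits, targets)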